05. Legacy Crawler
Step 3. Running the Legacy Crawler
Before we run the crawler, let's make sure it can write its results so that we can see it working! It'd be embarrassing to show this to your manager only to realize it's not fully functional!
First, find the "main" program file at src/main/java/com/udacity/webcrawler/main/WebCrawlerMain.java. You should find two TODOs there: one for writing the crawl output and one for writing the profile output.
Don't worry about the profile output yet; you'll get to that later. For now, complete the first TODO using the output path stored in the config field. You will have to use the CrawlResultWriter class that you just wrote: create an instance of CrawlResultWriter by passing in the CrawlResult (the code that creates the CrawlResult is already written).
Next, check the value of config.getResultPath(). If it's a non-empty string, create a Path using config.getResultPath() as the file name, then pass that Path to the CrawlResultWriter#write(Path) method. Alternatively, if the value of config.getResultPath() is empty, the results should be printed to standard output (also known as System.out).
Hint: There may be a standard Writer implementation in java.io (*cough* OutputStreamWriter *cough*) that converts System.out into a Writer, which can then be passed to CrawlResultWriter#write(Writer).
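To make that concrete, here is a minimal sketch of what the completed TODO might look like inside WebCrawlerMain. The variable names (result for the CrawlResult, config for the configuration field) and the exact CrawlResultWriter constructor and write signatures are assumptions based on the description above, so adapt them to your actual code:

// Sketch only. Assumes imports of java.io.OutputStreamWriter, java.io.Writer,
// and java.nio.file.Path, plus a CrawlResult named "result" created by the
// existing code and a configuration field named "config".
CrawlResultWriter resultWriter = new CrawlResultWriter(result);

String resultPath = config.getResultPath();
if (resultPath != null && !resultPath.isEmpty()) {
  // A non-empty path means the result should be written to that file.
  resultWriter.write(Path.of(resultPath));
} else {
  // An empty path means the result should go to standard output.
  // OutputStreamWriter adapts System.out (an OutputStream) into a Writer.
  Writer stdoutWriter = new OutputStreamWriter(System.out);
  resultWriter.write(stdoutWriter);
  stdoutWriter.flush(); // make sure buffered output actually reaches the terminal
}

Depending on how your CrawlResultWriter methods are declared, you may also need to handle or declare IOException around this code.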
Next, build the project (skipping tests, since they shouldn't all pass yet):
mvn package -Dmaven.test.skip=true
Finally, run the legacy crawler using the sample configuration file included with the project:
java -classpath target/udacity-webcrawler-1.0.jar \
com.udacity.webcrawler.main.WebCrawlerMain \
src/main/config/sample_config_sequential.json
Was the JSON result printed to the terminal?